ABSTRACT

An entire elementary school system with 60% white and 40% black pupils
was given several ability tests administered by 12 white and 8 black examiners
(Es). The tests measured verbal and nonverbal IQ, perceptual-motor cognitive
development, "speed and persistence" under neutral and motivating instructions,
listening-attention, and short-term rote memory for numbers. With the exception
of the "speed and persistence" test, on which white Es yielded higher mean
scores than black Es for both white and black pupils, the results for the
cognitive ability tests showed that the race of the E had unsystematic and
negligible effects in the testing of white and black pupils. This conclusion
applies to the IQ test results of both group-administered and individually
administered tests.




The Effect of Race of Examiner on the Mental 
Test Scores of White and Black Pupils

Arthur R. Jensen
University of California, Berkeley

How often is it said that the race of the examiner is an important
variable in the ability testing of ethnic minority children, or that black
children obtain lower scores when tested by a white examiner? Sattler
(1970), in a review of the research on this question, remarked: "In spite
of the paucity of research concerning the effects of differences in racial
status as a variable which affects the examiner-examinee relationship,
numerous writers have either concluded or suggested that this variable
may play an important role in the intelligence test situation" (p. 144).
And in this and another review (Sattler & Theye, 1967, p. 3^3) Sattler
cites a dozen references holding this belief, including books and articles
by such noted psychologists as Anastasi, Hilgard, Klineberg, Pettigrew,
Pressey, and Strong.

Though speculative claims of race of examiner (E) effects in
intelligence testing are frequent in the literature of race differences,
the total empirical research on the subject, eleven studies and a reanalysis
of one of these, constitutes a rather unimpressive body of evidence.
The studies are briefly summarized here in chronological order.

Canady (1936): On the first administration of the 1916 Stanford-Binet,
Ss obtained higher IQs with Es (1 black and 20 white) of their own race,
while on retest Ss obtained higher IQs with Es of the opposite race.
Sattler (1966) reanalyzed the experiment and concluded that the results are
inconclusive because of methodological deficiencies.

Pasamanick and Knoblock (1955): A white E testing forty 2-year-old
black Ss on the Gesell Developmental Examination was claimed to have obtained
lower verbal responsiveness scores than presumably would have been obtained
by a black E, but no Ss were tested by a black E for comparison.

Forrester and Klaus (1964): Twenty-four black kindergartners obtained
nonsignificantly higher Stanford-Binet IQs when tested by a female black E
than by a female white E.

La Crosse (1964): A white E obtained significantly lower Stanford-
Binet (L-M) retest scores when testing black Ss who had been previously 
tested by two black Es. The same white E obtained significantly higher 
retest scores with white Ss previously tested by three white Es. 

Pettigrew (1964): White Es (number not reported) are said to obtain
fewer correct responses than black Es (number not reported) from Northern
blacks given two tests (identification of six famous men and giving synonyms).
No statistical tests of significance are reported.

Miller and Phillips (1966): Three black and three white female Es
tested black and white children in Head Start in the South and found
no significant effects, either for race of E or for the race of E X race
of S interaction.

Pelosi (1968): Six black and six white Es tested young adult
black males enrolled in a Neighborhood Youth Corps on the Wechsler Adult
Intelligence Scale, the Purdue Pegboard, and the IPAT Culture Fair Test;
there were no significant effects of race of E.

Abramson (1969): Two black and two white female Es gave the Peabody
Picture Vocabulary Test to Eastern black and white kindergartners and first-
graders. No significant E effects for kindergartners, but white Es obtained
higher scores from white Ss than from black, and black Es obtained similar
scores from both groups.

Lipsitz (1969): Lorge-Thorndike group-administered test-retest by
one black and one white E showed no significant race of E or interaction
effects in Eastern black and white 4th, 5th, and 6th graders in private
schools (unrepresentative samples).

Caldwell and Knight (1970): Stanford-Binet test-retest with one
black female E and one white male E produced no significant E effect on 
6th grade Southern black children. 

Costello (1970): Two white and two black Es giving the Peabody
Picture Vocabulary Test to black preschoolers resulted in no significant
race of E effect.

In concluding a detailed critique of this research, Sattler (in
press) commented: "The studies reviewed . . . suggest that performance of
Negro and white subjects on individually administered intelligence tests
is not usually affected by the examiner's race. However, there are still
too few studies available to arrive at firm generalizations. Yet, as
Sattler (1970, p. 144) pointed out, numerous authorities have stated that
difference in racial status is a variable which affects the examiner-
examinee relationship. The research cited in this review as well as past
research offers no support for this statement."

Sattler (1970, p. 144) also points out that "little is known about
the effects of the examiner's race on scores obtained on group administered
intelligence tests." To examine this matter in terms of available evidence,
Shuey (1966) compared all the reported studies up to 1965 (19 in all) of
black IQ in elementary school children in the South in which the group
testing was done by a black tester with the test results obtained on all
Southern black school children, the vast majority of whom were tested by
white examiners. Shuey concluded:

"The 2,560 elementary school children tested by Negroes earned a mean IQ
of 80.9 as compared with a combined mean of 80.6 earned by more than 50,000
Southern Negro school children, an undetermined but probably a large number
of whom were tested by white investigators. The present writer also calcu-
lated the combined mean IQ achieved by 1,796 Southern colored high school
pupils who were tested by Negro adults. This was 82.9 as compared with a
mean of 82.1 secured by nearly 9,000 Southern colored high school students,
many of whom were examined by white researchers. From these comparisons it
would seem that the intelligence score of a Negro school child or high
school pupil has not been adversely affected by the presence of a white
tester" (p. 507).

Concerning Shuey's analysis, Dreger and Miller (1968) stated:
"Although Shuey, on the basis of finding no apparent differences in school-
age Negro children, concludes that such race of E X race of S interaction
does not exist, we have offered some reason to think it does exist" (p. 25).
But the most obvious objection to Shuey's analysis is that there was no
control over the samples tested by black and white Es. If for some reason
the less intelligent black Ss were more likely to be tested by black Es
(as might be the case in most rural Southern schools), the fact that the
more intelligent black Ss (more likely in urban schools) tested by white
Es did not obtain higher IQs than the Ss tested by black Es might only mean
that their performance had been depressed by the presence of a white E.
In a proper study pains should be taken to avoid any such biasing factors
in the assignment of white and black Es to white and black Ss.

The present study was designed to control such factors and was con-
ducted with large enough samples of both Es and Ss, over a sufficient age
range, and with an adequate variety of mental ability tests to permit
a race of E X race of S interaction to appear as significant if it exists. Since
statistical significance depends in part upon the sample size, and since the
samples in this study are very large, it is much more important to evaluate
the actual magnitudes of the examiner effects than merely to note
their level of statistical significance. All population differences, of
course, are significant, and this study is based on virtually the entire
elementary school population of a city of more than 100,000 population.

Method 

Subjects 

The Ss were virtually the total white and black elementary school
(kindergarten through 6th grade) population of the Berkeley Unified School
District. A total of nearly 9,000 pupils in all classes of 17 schools
were tested, with the exclusion only of children in special classes for
the retarded, the emotionally disturbed, and the neurologically and
physically handicapped. Since the present study focuses upon white-black
interaction of race of examiners (E) and race of subjects (Ss), the 11
percent of the school population who are Oriental or other ethnic minorities
(about 1 percent) are not included in the analyses. (The total school popu-
lation involved in this study is approximately 60% white and 40% black.)
Also not included are Ss who were absent on the day that a particular test
was administered to their class.

Ss' ethnicity was determined from the school records, which included
the parents' statement of the child's race, obtained when the child was
enrolled in the Berkeley schools.

Examiners 

There were 12 white (10 women and 2 men) and 8 black (6 women and 2
men) Es. All were between 2^ and ^0 years of age and all had either B.A.
or M.A. degrees in psychology or education. A few were University gradu-
ate students in the school psychology program and nearly all of them had
teaching credentials and had taught in public schools. They were selected
from among some fifty applicants on the basis of qualifications and inter-
views. They were paid at the daily rate for substitute teachers in the
Berkeley schools.

All Es were given copies of the tests and the manuals of instructions
for administration to study prior to the three all-day training sessions.
These sessions, conducted by three professionals with training and experience
in clinical and group testing, aimed to inculcate general principles of test
administration as well as specific instructions and practice in the tests to
be used. All testing procedures were demonstrated, and all Es, working in
small groups, had to practice the instructions and procedures in front of the
group and the instructor, who criticized and "shaped up" each E's performance
in terms of voice, emphasis, pacing, rapport, and general manner of presenta-
tion. The importance of strict adherence to the standard instructions and
time limits was emphasized repeatedly, and the psychometric rationale for
this was thoroughly explained. Es were provided with stopwatches for the
timed tests and were taught to operate the tape recorders used in two of
the tests. Es were also instructed in filling out a special form at the
conclusion of every test session concerning any unusual occurrences (e.g.,
a fire drill) which might have created nontypical testing conditions.

All Es were observed actually testing in the classroom at least once,
early in the testing program, by Dr. Egbert, our testing supervisor, or one
of the other professionals on the staff, with the aim of maintaining as much
uniformity of testing procedures as possible.

Not every E administered every one of the different tests used in
this study, but every test was administered by both white and black Es.

Assignment of Es to Schools and Classes

The assignment of Es to schools and classes was random within race of 
E. That is, on any given day, one black E was assigned at random to each 
school until the supply of black Es was used up; the same was done for white 
Es. Thus, every school received both white and black Es. These random
assignments were made on a day-to-day basis, so that all Es had equal chances 
of testing in all schools. The particular classes to be tested at a given 
school on a given day also were assigned at random to the white and black Es. 

Tests 

A variety of quite different tests was used, on the expectation that
they might elicit different degrees of sensitivity to examiner effects.
There were standard verbal and nonverbal IQ tests, which involved consi-
derable verbal instructions on the part of E, especially in Grades K to 3.
There was an untimed developmental perceptual-motor test; a "speed and per-
sistence" test intended to reflect effort and motivation induced by verbal
instructions in a test-taking situation; a test of Ss' ability to attend
to verbally given directions; and a short-term memory test. These
latter two tests involved the presence and supervision of the E as a proctor,
but were wholly administered and paced by means of a tape recording to insure
the greatest possible uniformity of administration.

Lorge-Thorndike Intelligence Tests. This is a nationally standardized
group-administered test of general intelligence. In the normative sample,
which was intended to be representative of the nation's school population,
the test has a mean IQ of 100 and a standard deviation of 16. It is
generally acknowledged to be one of the best standardized paper-and-pencil
tests of general intelligence.

The Manual of the Lorge-Thorndike Test states that the test was
designed to measure reasoning ability. It does not test proficiency in
specific skills taught in school, although the verbal tests, from Grade 4
and above, depend upon reading ability. The reading level required, however,
is intentionally kept considerably below the level of reasoning required for
correctly answering the test questions. Thus the test is essentially a test
of reasoning and not of reading ability, which is to say that it would have
more of its variance in common with nonverbal tests of reasoning ability
than with tests of reading per se.

The tests for Grades K-3 do not depend at all upon reading ability
but make use exclusively of pictorial items. The tests for Grades 4-6
consist of two parts, Verbal (V) and Nonverbal (NV). They are scored
separately and the raw score on each is converted to an IQ, with a norma-
tive mean of 100 and SD of 16. The chief advantage of keeping the two
scores separate is that the Nonverbal IQ does not overestimate or under-
estimate the child's general level of intellectual ability because of
specific skills or disabilities in reading.

The following forms of the Lorge-Thorndike Intelligence Tests were
used:

Level 1, Form B. Grades K-1

Level 2, Form B. Grades 2-3

Level 3, Form B, Verbal and Nonverbal. Grades 4-6

The "consumable" form of the test was used to obviate separate answer sheets
and the added difficulty they may involve for the testees.

Figure Copying Test. This test was developed at the Gesell Institute
of Child Study at Yale University as a means for measuring developmental
readiness for the traditional school learning tasks of the primary grades
(Ilg & Ames, 1967). The test consists of ten geometric forms arranged
in order of difficulty. The child must simply copy them, each on a separate
sheet of paper. The test involves no memory factor, since the figure to be
copied is before the child at all times. The test is administered without
time limit, although most children finish in 10 to 1^ minutes. The test is
best regarded as a developmental scale of mental ability. It correlates
substantially with other IQ tests, but it is considerably less culture-loaded
than most usual IQ tests. It is primarily a measure of general cognitive
development and not just of perceptual-motor ability. Children taking the
test are urged to attempt to copy every figure.

Each of the ten figures is scored on a 3-point scale going from 1 (low)
to 3 (high). (A score of zero is given in the rare instance when no attempt
has been made to copy a particular figure.) A score of 1 is given if an
attempt has been made but the child's drawing completely fails to resemble
the model. A score of 2 is given if there is fair resemblance to the model;
the figure need not be perfect but it must be easily recognizable as the model
which the child has attempted to copy. A score of 3 is given for an attempt
which duplicates the figure in all its essential characteristics; this is an
essentially adult level of performance. Since there are ten figures in all,
the possible range of scores goes from 10 to 30 (or 0 to 30 if zeros are
counted, but this is rare, since virtually all subjects attempt all ten
figures). Scoring reliability, as determined from correlations between
different scorers, is above .90.
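
The scoring rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name and the input format (a list of ten per-figure ratings) are hypothetical and do not come from the test manual.

```python
from typing import Sequence

N_FIGURES = 10               # the test consists of ten geometric forms
VALID_RATINGS = (0, 1, 2, 3)  # 0 = no attempt; 1 to 3 = rated attempt


def figure_copying_total(item_scores: Sequence[int]) -> int:
    """Return the total score as the sum of the ten per-figure ratings.

    Hypothetical helper: the input layout is assumed for illustration.
    """
    if len(item_scores) != N_FIGURES:
        raise ValueError("expected one rating for each of the ten figures")
    if any(score not in VALID_RATINGS for score in item_scores):
        raise ValueError("each figure is rated 0, 1, 2, or 3")
    return sum(item_scores)
```

With every figure attempted, totals fall between 10 and 30; a total below 10 can occur only when at least one figure received a zero.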

The high level of motivation maintained by this test is indicated by
the fact that the minimum score obtained in each group at each grade level
increases systematically with grade level. This suggests that all children
were making an attempt to perform in accordance with the instructions. An-
other indication that can be seen from the test booklets is that virtually
100 percent of the children in every ethnic group at every grade level
attempted to copy every figure. The attempts, even when unsuccessful,
usually show considerable effort, as indicated by redrawing the figure,
erasures, and drawing over the figure repeatedly in order to improve its like-
ness to the model. It is also noteworthy about this test that normal children
are generally not successful in drawing figures beyond their mental age level
and that special instructions and coaching on the drawing of these figures
hardly improve the child's performance. This test, in other words, is not
very susceptible to training, but measures some fundamental aspects of men-
tal development.

Listening-Attention Test. In the Listening-Attention Test the child
is presented with an answer sheet containing 100 pairs of digits in sets of
10. The child listens to a tape recording which speaks one digit every two
seconds. The child is required to put an X over the one digit in each pair
which has been heard on the tape recorder. The purpose of this test is to
determine the extent to which the child is able to pay attention to numbers
spoken on a tape recorder, to keep his place in the test, and to make the
appropriate responses to what he hears from moment to moment. Low scores
on this test indicate that the subject is not yet ready to take the Memory
for Numbers test which immediately follows it. High scores on the Listen-
ing-Attention Test indicate that the subject has the prerequisite skills
for taking the digit span (Memory for Numbers) test. The Listening-Attention
Test thus is intended as a means for detecting students who, for whatever
reason, are unable to hear and to respond to numbers read over a tape
recorder. The test itself makes no demands on the child's memory, but only
on his ability for listening, paying attention, and responding appropriately,
all prerequisites for the digit memory test that follows.

Memory for Numbers Test. The Memory for Numbers test is a measure
of digit span, or more generally, short-term memory. It consists of three
parts. Each part consists of six series of digits going from four digits
in a series up to nine digits in a series. The digit series are presented
on a tape recording on which the digits are spoken clearly by a male voice
at the rate of precisely one digit per second. The subjects write down as
many digits as they can recall at the conclusion of each series, which is
signaled by a "bong." Each part of the test is preceded by a short practice
test of three digit series in order to permit the tester to determine
whether the child has understood the instructions, etc. The practice test
also serves to familiarize the subject with the procedure of each of the
subtests. The first subtest is labeled Immediate Recall (I). Here the
subject is instructed to recall the series immediately after the last digit
has been spoken on the tape recorder. The second subtest consists of
Delayed Recall (D). Here the subject is instructed not to write down his
response until after ten seconds have elapsed after the last digit has been
spoken. The ten-second interval is marked by audible clicks of a metronome
and is terminated by the sound of a bong which signals the child to write
his response. The Delayed Recall condition invariably results in some
retention decrement. The third subtest is the repeated series test, in
which the digit series is repeated three times prior to recall; the subject
then recalls the series immediately after the last digit in the series has
been presented. Again, recall is signaled by a bong. Each repetition of
the series is separated by a tone with a duration of one second. The
repeated series almost invariably results in greater recall than the single
series. The Memory for Numbers test is very culture fair for children in
second grade and beyond who know their numerals and are capable of
listening and paying attention, as indicated by the Listening-Attention
Test. The maximum score on any one of the subtests is 39, that is, the
sum of the digit series from four through nine. In the present study, the
total score is used, i.e., the sum of the three subtest scores.
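
The arithmetic behind the maximum score can be checked with a short Python sketch. The names below are hypothetical, and the sketch assumes only what the text states: six series per subtest, of four through nine digits, and a total equal to the sum of the three subtest scores.

```python
SERIES_LENGTHS = range(4, 10)  # series of four, five, ..., nine digits
N_SUBTESTS = 3                 # immediate, delayed, and repeated series


def subtest_maximum() -> int:
    """Maximum subtest score: the sum of the six series lengths."""
    return sum(SERIES_LENGTHS)


def total_score(immediate: int, delayed: int, repeated: int) -> int:
    """Total score as the sum of the three subtest scores (hypothetical helper)."""
    limit = subtest_maximum()
    for score in (immediate, delayed, repeated):
        if not 0 <= score <= limit:
            raise ValueError("a subtest score must lie between 0 and the maximum")
    return immediate + delayed + repeated
```

Under these assumptions the largest possible total is three times the subtest maximum.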

Speed and Persistence Test (Making X's). The Making X's Test is
intended as an assessment of test-taking motivation. It gives an indica-
tion of the subject's willingness to comply with instructions in a group
testing situation and to mobilize effort in following those instructions
for a brief period of time. The test involves no intellectual component,
although for young children it probably involves some perceptual-motor
skills component, as reflected by increasing mean scores as a function of
age between grades 1 to ^. The wide range of individual differences among
children at any one grade level would seem to reflect mainly general moti-
vation and test-taking attitudes in a group situation. The test also serves
partly as an index of classroom morale, and it can be entered as a moderator
variable into correlational analyses with other ability and achievement tests.
Children who do very poorly on this test, it can be suspected, are likely not
to put out their maximum effort on ability tests given in a group situation
and therefore their scores are not likely to reflect their "true" level of
ability.

The Making X's Test consists of two parts. On Part I the subject
is asked simply to make X's in a series of squares for a period of 90
seconds. In this part the instructions say nothing about speed. They
merely instruct the child to make X's. The maximum possible score on
Part I is 1^, since there are 1^ squares provided in which the child can
make X's. After a 2-minute rest period the child turns the page of the
test booklet to Part II. Here the child is instructed to show how much
better he can perform than he did on Part I and to work as rapidly as
possible. The child is again given 90 seconds to make as many X's as he
can in the 150 boxes provided. The gain in score from Part I to Part II
reflects both a practice effect and an increase in motivation or effort
as a result of the motivating instructions, i.e., instruction to work as
rapidly as possible.

Results and Discussion 

The basic analysis performed on each test at each grade level in
which the test was administered is a nested ANOVA, with race of Es nested
within race of Ss. Since there were unequal Ns in the four cells of the
2 X 2 (race of E X race of S) design, the main effects for race of E are
based on unweighted means for white and black Es; that is to say, in the
overall means for white and black Es, equal weights are given to both means
despite their unequal Ns. Otherwise the overall mean difference between
white and black Es would be partly a function of the number of white and
black Ss they had tested, because there is a substantial main effect for
race of Ss.
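
A small Python sketch may make the reason for unweighted means concrete. All numbers and names here are hypothetical illustrations, not data from the study: the cell means are constructed so that race of E has no effect at all, while the Ns are deliberately unbalanced.

```python
# Keys are (race of E, race of S); values are hypothetical, not study data.
cell_means = {("W", "W"): 110.0, ("W", "B"): 90.0,
              ("B", "W"): 110.0, ("B", "B"): 90.0}
cell_ns = {("W", "W"): 300, ("W", "B"): 100,
           ("B", "W"): 100, ("B", "B"): 300}


def unweighted_e_difference(means):
    """White-E minus black-E mean, giving equal weight to both S groups."""
    white_e = (means[("W", "W")] + means[("W", "B")]) / 2
    black_e = (means[("B", "W")] + means[("B", "B")]) / 2
    return white_e - black_e


def weighted_e_difference(means, ns):
    """Same contrast, but with each cell mean weighted by its N."""
    def pooled(e_race):
        total = sum(means[(e_race, s)] * ns[(e_race, s)] for s in ("W", "B"))
        return total / sum(ns[(e_race, s)] for s in ("W", "B"))
    return pooled("W") - pooled("B")
```

In this constructed case the unweighted difference is zero, as it should be, whereas the N-weighted difference is 10 points purely because the white Es happened to test more white Ss.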

So that the magnitudes of the differences can be readily compared
from one grade to another and from one test to another, all differences
have been expressed in sigma units. The sigma in every case is the standard
deviation of test scores within groups, i.e., the standard deviation
excluding variance due to race of E, race of Ss, and their interaction.
Table 1 shows the Ns of Es and Ss for each of the tests at each grade.
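
The conversion to sigma units can be sketched as follows. This is a minimal illustration that assumes the within-groups standard deviation is obtained by pooling squared deviations around each cell's own mean; the function names and the list-of-lists input are hypothetical.

```python
import math


def pooled_within_sd(cells):
    """Pooled within-cell standard deviation.

    `cells` is a list of score lists, one per design cell (assumed layout).
    Deviations are taken around each cell's own mean, so differences
    between cell means do not enter the estimate.
    """
    sum_sq = 0.0
    df = 0
    for scores in cells:
        mean = sum(scores) / len(scores)
        sum_sq += sum((x - mean) ** 2 for x in scores)
        df += len(scores) - 1
    return math.sqrt(sum_sq / df)


def in_sigma_units(mean_difference, within_sd):
    """Express a raw mean difference as a fraction of the within-groups SD."""
    return mean_difference / within_sd
```

For example, a raw difference of 3 points with a within-groups SD of 15 corresponds to 0.2 sigma.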



Insert Table 1 about here 



Lorge-Thorndike IQ 

Tables 2 and 3 show the results for Lorge-Thorndike Nonverbal and
Verbal IQs, respectively. For Nonverbal IQ, the main effect of race of
E, as we can see in the first column, is very small; only at Grades 1 and
2 is the difference significant, and it amounts to less than a fifth of a
standard deviation, or less than 3 IQ points. (Negative numbers always
indicate that the black mean exceeded the white.) The overall mean differ-
ence between white and black Es (shown in the last two rows of the first
column) is very small and nonsignificant even for these very large samples.
The unweighted mean (X) here is the simple arithmetic average of the means
of every grade; the weighted mean is the average of the means of every grade,
each weighted by the total number of Ss on which the mean is based. Since
the Ns are usually similar from one grade to another, the weighted and
unweighted means do not differ appreciably.
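
The two summary rows can be illustrated with a brief Python sketch. The values used in the usage note are hypothetical, chosen only to show how the two averages diverge when the Ns differ.

```python
def unweighted_mean(values):
    """Simple arithmetic average of the per-grade values."""
    return sum(values) / len(values)


def weighted_mean(values, ns):
    """Average of the per-grade values, each weighted by its number of Ss."""
    if len(values) != len(ns):
        raise ValueError("need one N per value")
    return sum(v * n for v, n in zip(values, ns)) / sum(ns)
```

With hypothetical per-grade differences of 0.1 and 0.3 based on 100 and 300 Ss, the unweighted mean is 0.2 and the weighted mean is 0.25; with equal Ns the two coincide.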

Columns 2 and 3 show the mean differences between white and black
Es within each racial group of Ss. Again, these differences are very small,
and overall they are nonsignificant for the Nonverbal test. Column 4 shows
Table 2

Lorge-Thorndike Nonverbal IQ:
Mean Differences in Sigma Units

                                Mean W-B E Difference
                Mean W-B        ---------------------   Mean W-B          Between Es
Grade           E Difference    White Ss    Black Ss    S Difference(a)   Within Groups(b)

K                 .062            .062        .062        1.19              .252
1                -.199**         -.219**     -.180        1.57              .466
2                -.166**         -.192*      -.140        1.51              .525
3                -.011            .089       -.112        1.55              .569
4                 .025            .295**     -.249*       1.65              .592
5                 .080            .105        .056        1.75              .696
6                 .065            .159       -.029        1.75              .517

Unweighted X     -.021            .045       -.085        1.47              .459
Weighted X       -.026            .047       -.088        1.47              .460

(a) All differences significant beyond .01.
(b) Not tested for significance.
*  Significant at p < .05.
** Significant at p < .01.



Table 3

Lorge-Thorndike Verbal IQ:
Mean Differences in Sigma Units

                                Mean W-B E Difference
                Mean W-B        ---------------------   Mean W-B          Between Es
Grade           E Difference    White Ss    Black Ss    S Difference(a)   Within Groups(b)

4                -.005           -.015        .007        1.59              .680
5                 .404**          .571**      .457**      1.60              .415
6                 .296**          .422**      .170        1.95              .710

Unweighted X      .252**          .260**      .205**      1.71              .602
Weighted X        .255**          .265**      .209**      1.71              .602

(a) All differences significant beyond .01.
(b) Not tested for significance.
** Significant at p < .01.

the race difference between Ss, against which one can compare the magnitudes
of the differences shown in the other columns.

The last column shows the variation among Es within groups, that is, 
variation among Es not attributable to race of E or race of Ss or the 
interaction of these variables. This variation among Es is expressed as 
the standard deviation of the means of Es within groups divided by the 
standard deviation of Ss within groups. It should be noted, however, that 
some appreciable part of the variation among Es reflects differences between 
schools and classrooms, which inevitably results from the random assignment 
of a relatively small number of Es to a diversity of schools and classes. 
The interschool and interclass variations do not have a chance to "average 
out" over Es under the conditions of the present study. The between E 
variation allows no meaningful test of statistical significance but is 
presented here merely as a basis for comparing and evaluating the magnitudes 
of the other differences. 
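
The last-column index can be sketched in Python as follows. This is an illustration under assumptions: E means are grouped by design cell in a dictionary, and their standard deviation is pooled around each group's own mean. The names and the data layout are hypothetical.

```python
import math


def between_e_ratio(e_means_by_group, within_ss_sd):
    """SD of E means within groups, divided by the SD of Ss within groups.

    `e_means_by_group` maps a group label to the list of mean scores
    obtained by the individual Es in that group (assumed layout).
    """
    sum_sq = 0.0
    df = 0
    for means in e_means_by_group.values():
        group_mean = sum(means) / len(means)
        sum_sq += sum((m - group_mean) ** 2 for m in means)
        df += len(means) - 1
    return math.sqrt(sum_sq / df) / within_ss_sd
```

Because each E tested only some schools and classes, a ratio computed this way mixes examiner differences with school and classroom differences, which is why it serves only as a yardstick and not as a significance test.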

These general comments serve as well for Tables 6 through 10.

On the Verbal IQ (Table 3) we see that the race of E differences,
both for the main effect and within groups, show the white Es obtaining
slightly higher scores by about one-fourth to one-fifth of a sigma. But
the net effect of race of E is an overall mean difference between white
and black Ss of only about .05 sigma, i.e., less than one IQ point, as compared
with the overall 1.71 sigma difference between the white and black Ss means.

Individually Administered IQ Tests. To determine if individual
administration of the IQ tests would result in significant race of E by
race of S interaction, a number of Ss were tested individually on the
Lorge-Thorndike Tests. At the time that a group test was administered
to a class, one child in the class was taken at random from the class to
be given the same test individually, either in a private room in the school
or in a testing van parked on the school grounds. The child was selected
by random numbers from the class roll, irrespective of race or sex, prior
to the time of testing; in the case of absentees, there was always an alter-
nate who had been selected by the same random procedure. Every single class
in the school system was thus represented by one child selected at random
for an individual test. Equal numbers of white and black Es were assigned
at random as individual testers.

The essential results are shown in Tables 4 and 5. A two-way ANOVA
was done at each grade. The race of S main effect was significant beyond
the .01 level in every grade. Tables 4 and 5 give the exact p values of
F in the ANOVAs for the race of E main effect and for the race of E X race
of S interaction. With one exception (Verbal IQ at Grade 5) the p values
of the interaction fall far short of significance, and in general the race
of E has no appreciable or systematic effect on the individually administered
tests.



Insert Tables 4 and 5 about here


Figure Copying Test

Results for the Figure Copying Test are shown in Table 6. The race
of E effects can be seen to be quite small and unsystematic.



Insert Table 6 about here




Table 5

Lorge-Thorndike Verbal IQ: Individual Tests

                          Grade 4          Grade 5          Grade 6         All Grades
                            Ss               Ss               Ss               Ss
              Es        W       B        W       B        W       B        W       B

N             W        21       8       15      16       13       7       49      31
              B         5      10        7       7        5      11       17      28

Mean          W     114.5    92.6    123.1    83.9    117.3    87.9    117.9    87.0
              B     118.2    94.5    103.1    98.6    110.2    85.9    109.6    92.1

p value
  Race of E               .52              .58              .46              .59
  Interaction E X S       .83              .00              .66              .00




Table 6 
Figure Copying Test: 
Mean Differences in Sigma Units 

                 Mean W-B        Mean W-B E Difference      Mean W-B            Between Es / 
Grade            E Difference    White Ss     Black Ss      S Difference (a)    Within Groups (b) 

K                   .002          -.015         .019            1.00               .317 
1                  -.076          -.009**      -.144*            .95               .343 
2                  -.048          -.079        -.017             .85               .420 
3                [illegible]       .35[?]**     .185             .87               .539 
4                   .078           .237**      -.081             .99               .521 

Unweighted Mean     .045           .098**      -.008             .93               .428 
Weighted Mean       .037           .085*       -.015             .93               .421 

a All differences significant beyond .01. 
b Not tested for significance. 
*Significant at p < .05. 
**Significant at p < .01. 
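
The "Unweighted" row of Table 6 appears to be the simple average of the per-grade entries, each grade counting once regardless of its N. The sketch below checks that reading on the two columns whose entries are all legible. The names are illustrative, and the "simple average" interpretation is an assumption rather than something stated in the text.

```python
# Per-grade entries as read from Table 6 (Grades K, 1, 2, 3, 4).
s_difference = [1.00, 0.95, 0.85, 0.87, 0.99]         # Mean W-B S difference
between_within = [0.317, 0.343, 0.420, 0.539, 0.521]  # Between Es / Within Groups


def unweighted_mean(values):
    """Simple average across grades; every grade has equal weight."""
    return sum(values) / len(values)


s_mean = round(unweighted_mean(s_difference), 2)
ratio_mean = round(unweighted_mean(between_within), 3)

print("Unweighted mean S difference:", s_mean)
print("Unweighted mean Between Es / Within Groups:", ratio_mean)
```

Both results match the tabled unweighted values (.93 and .428). The weighted row presumably weights each grade by its number of pupils, which the table does not report, so it is not reproduced here.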
Speed and Persistence Test (Making Xs) 

It is interesting that this test, which was devised to reflect Ss' 
attitude and effort in a test situation and to be sensitive to motivating 
instructions, does in fact show far larger E effects than any of the other 
tests used in this study, differences amounting to half a standard deviation 
or more. It also shows by far the smallest overall racial difference between 
Ss of any of the tests. The consistently significant race of E effects 
uniformly favor the white Es. The neutral and motivating instructions do 
not appear to produce any differences with respect to the race variables. 
However, it should be remarked that performance under the motivating 
instructions is significantly higher for all groups than under the neutral 
instructions. 



Insert Tables 7 and 8 about here 

Listening-Attention and Memory for Numbers 

These two tests were expected to show the smallest E effects, since 
their administration was wholly by means of a tape recording and involved 
the Es only as proctors and distributors of test forms. This expectation 
is borne out by the results shown in Tables 9 and 10. 



Insert Tables 9 and 10 about here 



Table 7 

Speed and Persistence Test--First Try (Neutral Instructions): 
Mean Differences in Sigma Units 

                 Mean W-B        Mean W-B E Difference      Mean W-B         Between Es / 
Grade            E Difference    White Ss     Black Ss      S Difference     Within Groups (a) 

1                   .178**         .287**       .069            .56**           .812 
2                   .508**         .570**       .693**         -.09             .764 
3                [illegible]       .588**       .496**         -.20**         [illegible] 
4                [illegible]      1.185**      1.109**       [illegible]**     1.043 
5                [illegible]       .650[?]      .4[?]**         .10            1.062 
6                   .500**       [illegible]    .362**          .17**           .982 

Unweighted Mean     .571**         .[?]20**     .530**          .02             .919 
Weighted Mean       .362**         .614**       .526**          .03           [illegible] 

a Not tested for significance. 
**Significant at p < .01. 



Table 8 

Speed and Persistence Test--Second Try (Motivating Instructions): 
Mean Differences in Sigma Units 

                 Mean W-B        Mean W-B E Difference      Mean W-B         Between Es / 
Grade            E Difference    White Ss     Black Ss      S Difference     Within Groups (a) 

1                   .26[?]       [illegible]    .070            .55**           .793 
2                   .617**       [illegible]    .594**          .07             .727 
3                   .685**        1.015**       .557**       [illegible]        .818 
4                  1.019**        1.265**       .775**         -.55**          1.086 
5                   .587**         .621**       .125           -.05            1.093 
6                   .477**         .650**       .524**         -.05            1.045 

Unweighted Mean  [illegible]       .771**       .574**         -.05             .927 
Weighted Mean       .570**         .766**       .574**         -.02             .921 

a Not tested for significance. 
**Significant at p < .01. 



Table 9 
Listening-Attention Test: 
Mean Differences in Sigma Units 

                 Mean W-B        Mean W-B E Difference      Mean W-B            Between Es / 
Grade            E Difference    White Ss     Black Ss      S Difference (a)    Within Groups (b) 

2                  -.055           .020        -.131             .23               .119 
3                  -.288**        -.039         .022             .36               .112 
4                  -.121           .036        -.279**           .32               .141 
5                   .059           .046         .071             .18               .461 
6                   .145*          .098         .192*            .19               .203 

Unweighted Mean    -.052           .032        -.025             .25               .207 
Weighted Mean      -.053           .031        -.030             .25               .203 

a All differences significant beyond .01. 
b Not tested for significance. 
*Significant at p < .05. 
**Significant at p < .01. 
Table 10 
Memory for Numbers Test: 
Mean Differences in Sigma Units 

                 Mean W-B        Mean W-B E Difference      Mean W-B            Between Es / 
Grade            E Difference    White Ss     Black Ss      S Difference (a)    Within Groups (b) 

2                  -.070          -.229**      -.125             .61               .114 
3                   .161*          .118         .204             .58               .201 
4                   .062          -.055         .180             .59               .215 
5                   .105           .063         .147             .67               .254 
6                  -.002          -.143         .139             .72               .236 

Unweighted Mean     .051          -.049         .109             .63               .204 
Weighted Mean       .048          -.055         .100             .63               .201 

a All differences significant beyond .01. 
b Not tested for significance. 
**Significant at p < .01. 
Conclusion 

Readers must evaluate the magnitudes of the race of E effects not 
so much in terms of their level of statistical significance as in relation 
to the magnitudes of the other sources of variance in test scores and in 
relation to the size of differences in mental test scores that are of 
practical or theoretical consequence in any particular context. From the 
present results for the tests of cognitive ability (i.e., excepting the 
Speed and Persistence Test), it seems safe to conclude that for all 
practical purposes the race of the examiner is of negligible consequence in 
the testing of white and black school children. 
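
One way to make this comparison of magnitudes concrete is to set each test's race of E difference beside its race of S difference, both in sigma units. The sketch below does so with the unweighted means read from the first E-difference column and the S-difference column of Tables 6, 7, 9, and 10. The ratio itself is only an illustrative summary and is not a statistic computed in the study; all names in the code are hypothetical.

```python
# Unweighted mean differences in sigma units, as read from the tables:
# (race-of-E difference, race-of-S difference).
effects = {
    "Figure Copying (Table 6)": (0.045, 0.93),
    "Listening-Attention (Table 9)": (-0.052, 0.25),
    "Memory for Numbers (Table 10)": (0.051, 0.63),
    "Speed and Persistence, first try (Table 7)": (0.571, 0.02),
}


def e_to_s_ratio(e_diff, s_diff):
    """Size of the examiner effect relative to the pupil-group difference."""
    return abs(e_diff) / abs(s_diff)


# Illustrative summary only: not a statistic reported by the author.
ratios = {test: e_to_s_ratio(e, s) for test, (e, s) in effects.items()}

for test, ratio in ratios.items():
    print(f"{test}: |E difference| / |S difference| = {ratio:.2f}")
```

For the three cognitive tests the E difference is a small fraction of the S difference (about a fifth at most), whereas for the Speed and Persistence Test the relation is reversed, in line with the exception noted above.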